XLING: Matching Query Sentences to a Parallel Corpus using Topic Models for WSD

نویسندگان

  • Liling Tan
  • Francis Bond
چکیده

This paper describes the XLING system participation in SemEval-2013 Crosslingual Word Sense Disambiguation task. The XLING system introduces a novel approach to skip the sense disambiguation step by matching query sentences to sentences in a parallel corpus using topic models; it returns the word alignments as the translation for the target polysemous words. Although, the topic-model base matching underperformed, the matching approach showed potential in the simple cosine-based surface similarity matching.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Semi-supervised Word Sense Disambiguation with Neural Models

Determining the intended sense of words in text – word sense disambiguation (WSD) – is a long-standing problem in natural language processing. In this paper, we present WSD algorithms which use neural network language models to achieve state-of-the-art precision. Each of these methods learns to disambiguate word senses using only a set of word senses, a few example sentences for each sense take...

متن کامل

Use of Combined Topic Models in Unsupervised Domain Adaptation for Word Sense Disambiguation

Topic models can be used in an unsupervised domain adaptation for Word Sense Disambiguation (WSD). In the domain adaptation task, three types of topic models are available: (1) a topic model constructed from the source domain corpus: (2) a topic model constructed from the target domain corpus, and (3) a topic model constructed from both domains. Basically, three topic features made from each to...

متن کامل

Multi-level Bootstrapping For Extracting Parallel Sentences From a Quasi-Comparable Corpus

We propose a completely unsupervised method for mining parallel sentences from quasi-comparable bilingual texts which have very different sizes, and which include both in-topic and off-topic documents. We discuss and analyze different bilingual corpora with various levels of comparability. We propose that while better document matching leads to better parallel sentence extraction, better senten...

متن کامل

Study of Word Sense Disambiguation System that uses Contextual Features - Approach of Combining Associative Concept Dictionary and Corpus -

We propose a Word Sense Disambiguation (WSD) method that accurately classifies ambiguous words to concepts in the Associative Concept Dictionary (ACD) even when the test corpus and the training corpus for WSD are acquired from different domains. Many WSD studies determine the context of the target ambiguous word by analyzing sentences containing the target word. However, they offer poor perform...

متن کامل

Learning Word Sense Distributions, Detecting Unattested Senses and Identifying Novel Senses Using Topic Models

Unsupervised word sense disambiguation (WSD) methods are an attractive approach to all-words WSD due to their non-reliance on expensive annotated data. Unsupervised estimates of sense frequency have been shown to be very useful for WSD due to the skewed nature of word sense distributions. This paper presents a fully unsupervised topic modelling-based approach to sense frequency estimation, whic...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013